We propose a two-sample test for the means of high-dimensional
data when the data dimension is much larger than the sample size. 
Hotelling's classical test does not work for this "large p, small n" 
situation. The proposed test does not require explicit conditions in 
the relationship between the data dimension and sample size. This 
offers much flexibility in analyzing high-dimensional data. An appli- 
cation of the proposed test is in testing significance for sets of genes 
which we demonstrate in an empirical study on a leukemia data set. 

1. Introduction. High-dimensional data are increasingly encountered in 
many applications of statistics and most prominently in biological and fi- 
nancial studies. A common feature of high-dimensional data is that, while 
the data dimension is high, the sample size is relatively small. This is the 
so-called "large p, small n" phenomenon where $p/n \to \infty$; here p is the data
dimension and n is the sample size. The high data dimension ("large p") 
alone has created the need to renovate and rewrite some of the conven- 
tional multivariate analysis procedures; these needs only get much greater 
for "large p small n" situations. 

A specific "large p, small n" situation arises when simultaneously testing 
a large number of hypotheses which is largely motivated by the identification 
of significant genes in microarray and genetic sequence studies. A natural 
question is how many hypotheses can be tested simultaneously. This paper 
tries to answer this question in the context of two-sample simultaneous tests 
for means. Consider two random samples $X_{i1}, \ldots, X_{in_i} \in R^p$ for i = 1 and
2 which have means $\mu_1 = (\mu_{11}, \ldots, \mu_{1p})'$ and $\mu_2 = (\mu_{21}, \ldots, \mu_{2p})'$ and
covariance matrices $\Sigma_1$ and $\Sigma_2$, respectively. We consider testing the following
high-dimensional hypothesis:

(1.1) $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$.

The hypothesis $H_0$ consists of the p marginal hypotheses $H_{0l}: \mu_{1l} = \mu_{2l}$ for
$l = 1, \ldots, p$ regarding the means on each data dimension.

There have been a series of important studies on the high-dimensional 
problem. Van der Laan and Bryan (2001) show that the sample mean of p- 
dimensional data can consistently estimate the population mean uniformly 
across p dimensions if log(p) = o(n) for bounded random variables. In a ma- 
jor generalization, Kosorok and Ma (2007) consider uniform convergence for 
a range of univariate statistics constructed for each data dimension which 
includes the marginal empirical distribution, sample mean and sample me- 
dian. They establish the uniform convergence across p dimensions when 
$\log(p) = o(n^{1/2})$ or $\log(p) = o(n^{1/3})$, depending on the nature of the marginal
statistics. Fan, Hall and Yao (2007) evaluate approximating the overall level
of significance for simultaneous testing of means. They demonstrate that
the bootstrap can accurately approximate the overall level of significance if
$\log(p) = o(n^{1/3})$ when the marginal tests are performed based on the normal
or the t-distributions. See also Fan, Peng and Huang (2005) and Huang,
Wang and Zhang (2005) for high-dimensional estimation and testing in semi- 
parametric regression models. 

In an important work, Bai and Saranadasa (1996) propose using $\|\bar{X}_1 - \bar{X}_2\|$
to replace $(\bar{X}_1 - \bar{X}_2)' S_n^{-1} (\bar{X}_1 - \bar{X}_2)$ in Hotelling's $T^2$-statistic, where
$\bar{X}_1$ and $\bar{X}_2$ are the two sample means, $S_n$ is the pooled sample covariance
under the assumption $\Sigma_1 = \Sigma_2 = \Sigma$, and $\|\cdot\|$ denotes the Euclidean norm in $R^p$. They
establish the asymptotic normality of the test statistic and show that it has
attractive power properties when $p/n \to c < \infty$ and under some restriction on
the maximum eigenvalue of $\Sigma$. However, the requirement of p and n being of
the same order is too restrictive to be used in the "large p, small n" situation.

To allow simultaneous testing for ultra high-dimensional data, we construct
a test which allows p to be arbitrarily large and independent of
the sample size as long as, in the case of common covariance $\Sigma$,
$\mathrm{tr}(\Sigma^4) = o\{\mathrm{tr}^2(\Sigma^2)\}$, where $\mathrm{tr}(\cdot)$ is the trace operator of a matrix. The above condition
on $\Sigma$ is trivially true for any p if either all the eigenvalues of $\Sigma$ are bounded
or the largest eigenvalue is of smaller order than $(p - b)^{1/2} b^{-1/4}$, where b is the
number of unbounded eigenvalues. We establish the asymptotic normality of
a test statistic which leads to a two-sample test for high-dimensional data.

Testing significance for gene-sets rather than a single gene is the latest 
development in genetic data analysis. A critical need for gene-set testing is to 
have a multivariate test that is applicable to a wide range of data dimensions 
(the number of genes in a set). It requires P-values for all gene-sets to allow
procedures based on either the Bonferroni correction or the false discovery 
rate [Benjamini and Hochberg (1995)] to take into account the multiplicity
in the test. We demonstrate in this paper how to use the proposed test for 
testing significance for gene-sets. An advantage of the proposed test is in its 
readily producing P-values of significance for each gene-set under study so 
that the multiplicity of multiple testing can be taken into consideration. 

The paper is organized as follows. We outline in Section 2 the framework 
of the two-sample tests for high-dimensional data and introduce the pro- 
posed test statistic. Section 3 provides the theoretical properties of the test. 
How to apply the proposed test of significance for gene-sets is demonstrated 
in Section 4 which includes an empirical study on an acute lymphoblastic 
leukemia data set. Results of simulation studies are reported in Section 5. 
All the technical details are given in Section 6. 

2. Test statistic. Suppose we have two independent and identically distributed
random samples in $R^p$,

$\{X_{i1}, X_{i2}, \ldots, X_{in_i}\} \overset{\text{i.i.d.}}{\sim} F_i$ for i = 1 and 2,

where $F_i$ is a distribution in $R^p$ with mean $\mu_i$ and covariance $\Sigma_i$. A well-pursued
interest in high-dimensional data analysis is to test whether the two
high-dimensional populations have the same mean, namely

(2.1) $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$.

The above hypothesis consists of p marginal hypotheses regarding the means 
of each data dimension. An important question from the point of view of multiple
testing is how many marginal hypotheses can be tested simultaneously.
The works of van der Laan and Bryan (2001), Kosorok and Ma (2007) and
Fan, Hall and Yao (2007) are designed to address the question. The existing
results show that p can reach the rate of $e^{\alpha n^{\beta}}$ for some positive constants $\alpha$
and $\beta$. In establishing a rate of the above form, both van der Laan and Bryan
(2001) and Kosorok and Ma (2007) assume that the marginal distributions 
of Fi and F2 are all supported on bounded intervals. 

Hotelling's $T^2$ test is the conventional test for the above hypothesis when
the dimension p is fixed and is less than $n =: n_1 + n_2 - 2$ and when
$\Sigma_1 = \Sigma_2 = \Sigma$, say. Its performance for high-dimensional data is evaluated in Bai
and Saranadasa (1996) when $p/n \to c \in [0, 1)$, which reveals a decreasing
power as c gets larger. A reason for this negative effect of high dimension
is the presence of the inverse of the covariance matrix in the $T^2$ statistic.
While standardizing by the covariance brings benefits for data with a fixed
dimension, it becomes a liability for high-dimensional data. In particular, the
sample covariance matrix $S_n$ may not converge to the population covariance
when p and n are of the same order. Indeed, Yin, Bai and Krishnaiah (1988)
show that when $p/n \to c$, the smallest and the largest eigenvalues of the
sample covariance $S_n$ do not converge to the respective eigenvalues of $\Sigma$. The
same phenomenon, but on the weak convergence of the extreme eigenvalues
of the sample covariance, is found in Tracy and Widom (1996). When p > n,
Hotelling's $T^2$ statistic is not defined as $S_n$ may not be invertible.

Our proposed test is motivated by Bai and Saranadasa (1996), who propose
testing hypothesis (2.1) under $\Sigma_1 = \Sigma_2 = \Sigma$ based on

(2.2) $M_n = (\bar{X}_1 - \bar{X}_2)'(\bar{X}_1 - \bar{X}_2) - \tau\,\mathrm{tr}(S_n)$,

where $S_n = \frac{1}{n} \sum_{i=1}^{2} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)'$ and $\tau = \frac{n_1 + n_2}{n_1 n_2}$. The key
feature of the Bai and Saranadasa proposal is removing $S_n^{-1}$ in Hotelling's $T^2$,
since having $S_n^{-1}$ is no longer beneficial when $p/n \to c > 0$. The subtraction
of $\tau\,\mathrm{tr}(S_n)$ in (2.2) is to make $E(M_n) = \|\mu_1 - \mu_2\|^2$. The asymptotic normality
of $M_n$ was established and a test statistic was formulated by standardizing
$M_n$ with an estimate of its standard deviation.

The following are the main conditions assumed in Bai-Saranadasa's test: 

(2.3) $p/n \to c < \infty$ and $\lambda_p = o(p^{1/2})$;

(2.4) $n_1/(n_1 + n_2) \to k \in (0, 1)$ and $(\mu_1 - \mu_2)'\Sigma(\mu_1 - \mu_2) = o\{\mathrm{tr}(\Sigma^2)/n\}$,

where $\lambda_p$ denotes the largest eigenvalue of $\Sigma$.

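A careful study of the $M_n$ statistic reveals that the restrictions on p and
n, and on $\lambda_p$ in (2.3) are needed to control the terms $\sum_{j=1}^{n_i} X_{ij}'X_{ij}$, i = 1 and 2,
in $\|\bar{X}_1 - \bar{X}_2\|^2$. However, these two terms are not useful in the testing. To
appreciate this point, let us consider

$T_n = \frac{\sum_{i \neq j}^{n_1} X_{1i}'X_{1j}}{n_1(n_1 - 1)} + \frac{\sum_{i \neq j}^{n_2} X_{2i}'X_{2j}}{n_2(n_2 - 1)} - 2\,\frac{\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} X_{1i}'X_{2j}}{n_1 n_2}$

after removing $\sum_{j=1}^{n_i} X_{ij}'X_{ij}$ for i = 1 and 2 from $\|\bar{X}_1 - \bar{X}_2\|^2$. Elementary
derivations show that

$E(T_n) = \|\mu_1 - \mu_2\|^2$.

Hence, $T_n$ is basically all we need for testing. Bai and Saranadasa used $\mathrm{tr}(S_n)$
to offset the two diagonal terms. However, $\mathrm{tr}(S_n)$ itself imposes demands on
the dimensionality too.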
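
As a rough illustration of the statistic just described, the following sketch computes $T_n$ from two data matrices whose rows are observations. The function name `tn_statistic` is hypothetical, and the formula coded here (two off-diagonal within-sample averages minus twice the cross-sample average) is our reading of the definition of $T_n$. The identity $\sum_{i \neq j} X_i'X_j = \|\sum_i X_i\|^2 - \sum_i \|X_i\|^2$ is used to avoid a double loop.

```python
import numpy as np

def tn_statistic(x1, x2):
    """T_n for samples stored row-wise in x1 (n1 x p) and x2 (n2 x p)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]
    s1, s2 = x1.sum(axis=0), x2.sum(axis=0)
    # sum_{i != j} X_i'X_j = ||sum_i X_i||^2 - sum_i ||X_i||^2
    off1 = s1 @ s1 - np.sum(x1 * x1)
    off2 = s2 @ s2 - np.sum(x2 * x2)
    cross = s1 @ s2  # sum_i sum_j X_1i'X_2j
    return off1 / (n1 * (n1 - 1)) + off2 / (n2 * (n2 - 1)) - 2.0 * cross / (n1 * n2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(30, 500))        # p = 500 exceeds both sample sizes
    b = rng.normal(size=(25, 500)) + 0.1  # every mean component shifted by 0.1
    print(tn_statistic(a, b))             # estimates ||mu_1 - mu_2||^2 = 5
```

Because the diagonal terms $X_{ij}'X_{ij}$ never enter, no correction such as $\mathrm{tr}(S_n)$ is needed to center the statistic.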

A derivation in the Appendix shows that under $H_1$ and the condition
in (3.4),

$\mathrm{Var}(T_n) = \left\{\frac{2}{n_1(n_1 - 1)}\mathrm{tr}(\Sigma_1^2) + \frac{2}{n_2(n_2 - 1)}\mathrm{tr}(\Sigma_2^2) + \frac{4}{n_1 n_2}\mathrm{tr}(\Sigma_1\Sigma_2)\right\}\{1 + o(1)\}$,

where the o(1) term vanishes under $H_0$.

3. Main results. We assume, like Bai and Saranadasa (1996), the following
general multivariate model:

(3.1) $X_{ij} = \Gamma_i Z_{ij} + \mu_i$ for $j = 1, \ldots, n_i$, i = 1 and 2,

where each $\Gamma_i$ is a $p \times m$ matrix for some $m \ge p$ such that $\Gamma_i \Gamma_i' = \Sigma_i$, and
$\{Z_{ij}\}_{j=1}^{n_i}$ are m-variate independent and identically distributed (i.i.d.) random
vectors satisfying $E(Z_{ij}) = 0$ and $\mathrm{Var}(Z_{ij}) = I_m$, the $m \times m$ identity matrix.
Furthermore, if we write $Z_{ij} = (z_{ij1}, \ldots, z_{ijm})'$, we assume $E(z_{ijk}^4) = 3 + \Delta < \infty$, and

(3.2) $E(z_{ijl_1}^{\alpha_1} z_{ijl_2}^{\alpha_2} \cdots z_{ijl_q}^{\alpha_q}) = E(z_{ijl_1}^{\alpha_1}) E(z_{ijl_2}^{\alpha_2}) \cdots E(z_{ijl_q}^{\alpha_q})$

for a positive integer q such that $\sum_{l=1}^{q} \alpha_l \le 8$ and $l_1 \neq l_2 \neq \cdots \neq l_q$. Here
$\Delta$ describes the difference between the fourth moments of $z_{ijl}$ and N(0, 1).
Model (3.1) says that $X_{ij}$ can be expressed as a linear transformation of an
m-variate $Z_{ij}$ with zero mean and unit variance that satisfies (3.2). Model
(3.1) is similar to factor models in multivariate analysis. However, instead of
having the number of factors m < p as in conventional multivariate analysis,
we require $m \ge p$. This is to allow the basic characteristics of the covariance
$\Sigma_i$, for instance its rank and eigenvalues, to not be affected by the transformation.
The rank and eigenvalues would be affected if m < p. The fact
that m is arbitrary offers much flexibility in generating a rich collection
of dependence structures. Condition (3.2) means that each $Z_{ij}$ has a kind of
pseudo-independence among its components $\{z_{ijl}\}_{l=1}^{m}$. Obviously, if $Z_{ij}$ does
have independent components, then (3.2) is trivially true.

We do not assume $\Sigma_1 = \Sigma_2$, as it is a rather strong assumption and, most
importantly, such an assumption is harder to verify for high-dimensional
data. Testing certain special structures of the covariance matrix when p and
n are of the same order has been considered in Ledoit and Wolf (2002) and
Schott (2005).

We assume 

(3.3) $n_1/(n_1 + n_2) \to k \in (0, 1)$ as $n \to \infty$,

(3.4) $(\mu_1 - \mu_2)'\Sigma_i(\mu_1 - \mu_2) = o[n^{-1}\mathrm{tr}\{(\Sigma_1 + \Sigma_2)^2\}]$ for i = 1 or 2,

which generalize (2.4) to unequal covariances. Condition (3.4) is obviously
satisfied under $H_0$ and implies that the difference between $\mu_1$ and $\mu_2$ is small
relative to $\mathrm{tr}\{(\Sigma_1 + \Sigma_2)^2\}$ so that a workable expression for the variance
of $T_n$ under $H_0$ and the specified local alternative can be derived. It can
be viewed as a high-dimensional version of the local alternative hypotheses.
When p is fixed, if we use a standard test for two population means, for
instance Hotelling's $T^2$ test, the local alternative hypotheses have the form
$\mu_1 - \mu_2 = \tau n^{-1/2}$ for a nonzero constant vector $\tau \in R^p$. Hotelling's test
has nontrivial power under such local alternatives [Anderson (2003)]. If we
assume each component of $\mu_1 - \mu_2$ is the same, say $\delta$, then the local alternatives
imply $\delta = O(n^{-1/2})$ for a fixed p. When the difference is $o(n^{-1/2})$,
Hotelling's test has no power beyond the level of significance.

To gain insight into (3.4) for high-dimensional situations, let us assume all
the eigenvalues of $\Sigma_i$ are bounded above from infinity and below away from
zero so that $\Sigma_i = I_p$ is a special case of such a regime. Let us also assume,
like above, that each component of $\mu_1 - \mu_2$ is the same as a fixed $\delta$, namely
$\mu_{1l} - \mu_{2l} = \delta$ for $l = 1, \ldots, p$. Then (3.4) implies $\delta = o(n^{-1/2})$, which is a
smaller order than $\delta = O(n^{-1/2})$ for the fixed p case. This can be understood
as the high-dimensional data ($p \to \infty$) containing more information, which
allows finer resolution in differentiating the two means in each component
than that in the fixed p case.
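
The order of $\delta$ quoted above can be checked by hand in the simplest member of this regime. The following worked step is a sketch under the additional assumption $\Sigma_1 = \Sigma_2 = I_p$ (a special case named in the text), with $\mathbf{1}_p$ denoting the p-vector of ones, a symbol introduced here only for the illustration.

```latex
% Assume \Sigma_1 = \Sigma_2 = I_p and \mu_1 - \mu_2 = \delta \mathbf{1}_p.
\begin{align*}
(\mu_1 - \mu_2)'\Sigma_i(\mu_1 - \mu_2) &= \delta^2\, \mathbf{1}_p'\mathbf{1}_p = p\,\delta^2, \\
n^{-1}\,\mathrm{tr}\{(\Sigma_1 + \Sigma_2)^2\} &= n^{-1}\,\mathrm{tr}(4 I_p) = 4p/n .
\end{align*}
% Condition (3.4) therefore reads p\,\delta^2 = o(4p/n); the factor p cancels:
\[
  \delta^2 = o(n^{-1}) \quad\Longleftrightarrow\quad \delta = o(n^{-1/2}).
\]
```

The cancellation of p is what allows per-component differences smaller than the fixed-p rate $n^{-1/2}$ to remain within the local alternative.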

To understand the performance of the test when (3.4) is not valid, we
reverse the local alternative condition (3.4) to

(3.5) $n^{-1}\mathrm{tr}\{(\Sigma_1 + \Sigma_2)^2\} = o\{(\mu_1 - \mu_2)'\Sigma_i(\mu_1 - \mu_2)\}$ for i = 1 or 2,

implying that the Mahalanobis distance between $\mu_1$ and $\mu_2$ is of a larger order
than that of $\mathrm{tr}\{(\Sigma_1 + \Sigma_2)^2\}$. This condition can be viewed as a version of
fixed alternatives. We will establish the asymptotic normality of $T_n$ under either
(3.4) or (3.5) in Theorem 1.

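The condition we impose on p to replace the first part of (2.3) is

(3.6) $\mathrm{tr}(\Sigma_i\Sigma_j\Sigma_l\Sigma_h) = o[\mathrm{tr}^2\{(\Sigma_1 + \Sigma_2)^2\}]$ for i, j, l, h = 1 or 2,

as $p \to \infty$. To appreciate this condition, consider the case of $\Sigma_1 = \Sigma_2 = \Sigma$.
Then (3.6) becomes

(3.7) $\mathrm{tr}(\Sigma^4) = o\{\mathrm{tr}^2(\Sigma^2)\}$.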

Let $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_p$ be the eigenvalues of $\Sigma$. If all eigenvalues are bounded,
then (3.7) is trivially true. If, otherwise, there are b unbounded eigenvalues
with respect to p, and the remaining p - b eigenvalues are bounded above by
a finite constant M such that $(p - b) \to \infty$ and $(p - b)\lambda_1^2 \to \infty$, then sufficient
conditions for (3.7) are

(3.8) $\lambda_p = o\{(p - b)^{1/2}\lambda_1 b^{-1/4}\}$ or $\lambda_p = o\{(p - b)^{1/4}\lambda_1^{1/2}\lambda_{p-b+1}^{1/2}\}$,

where b can be either bounded or diverging to infinity, and the smallest
eigenvalue $\lambda_1$ can converge to zero. To appreciate these, we note that

$\frac{\mathrm{tr}(\Sigma^4)}{\mathrm{tr}^2(\Sigma^2)} \le \frac{(p - b)M^4 + b\lambda_p^4}{(p - b)^2\lambda_1^4 + b^2\lambda_{p-b+1}^4 + 2(p - b)b\lambda_1^2\lambda_{p-b+1}^2}$.

Hence, the ratio converges to 0 under either condition in (3.8).
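
To make the eigenvalue condition concrete, the sketch below evaluates the ratio $\mathrm{tr}(\Sigma^4)/\mathrm{tr}^2(\Sigma^2)$ directly from a list of eigenvalues, using $\mathrm{tr}(\Sigma^r) = \sum_l \lambda_l^r$. The helper name `trace_ratio` and the two eigenvalue configurations are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def trace_ratio(eigenvalues):
    """tr(Sigma^4) / tr^2(Sigma^2), computed from the eigenvalues of Sigma."""
    lam = np.asarray(eigenvalues, dtype=float)
    return np.sum(lam ** 4) / np.sum(lam ** 2) ** 2

if __name__ == "__main__":
    for p in (10, 100, 1000, 10000):
        bounded = np.ones(p)                        # all eigenvalues equal to 1
        mild = np.append(np.ones(p - 1), p ** 0.25)  # one slowly diverging eigenvalue
        fast = np.append(np.ones(p - 1), float(p))   # one eigenvalue growing like p
        print(p, trace_ratio(bounded), trace_ratio(mild), trace_ratio(fast))
```

With bounded eigenvalues the ratio equals 1/p, and it still shrinks when a single eigenvalue grows like $p^{1/4}$. When one eigenvalue grows like p, the ratio stays near 1, so the condition fails.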

The following theorem establishes the asymptotic normality of $T_n$.

Theorem 1. Under the assumptions (3.1), (3.2), (3.3), (3.6) and either
(3.4) or (3.5),

$\frac{T_n - \|\mu_1 - \mu_2\|^2}{\sqrt{\mathrm{Var}(T_n)}} \xrightarrow{d} N(0, 1)$ as $p \to \infty$ and $n \to \infty$.



The asymptotic normality is attained without imposing any explicit restriction
between p and n directly. The only restriction on the dimension is
(3.6) or (3.7). As the discussion given just before Theorem 1 suggests, (3.7)
is satisfied provided that the divergent eigenvalues of $\Sigma$ are not
too many and their divergence is not too fast. The reason for attaining this in
the case of high data dimension is that the statistic $T_n$ is univariate, despite
the fact that the hypothesis $H_0$ is of high dimension. This is different
from using a high-dimensional statistic. Indeed, Portnoy (1986) considers
the central limit theorem for the p-dimensional sample mean $\bar{X}$ and finds
that the central limit theorem is not valid if p is not of a smaller order than $\sqrt{n}$.

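As shown in Section 6.1, $\mathrm{Var}(T_n) = \sigma_n^2\{1 + o(1)\}$ where, under (3.4),

(3.9) $\sigma_n^2 =: \sigma_{n1}^2 = \frac{2}{n_1(n_1 - 1)}\mathrm{tr}(\Sigma_1^2) + \frac{2}{n_2(n_2 - 1)}\mathrm{tr}(\Sigma_2^2) + \frac{4}{n_1 n_2}\mathrm{tr}(\Sigma_1\Sigma_2)$

and under (3.5),

(3.10) $\sigma_n^2 =: \sigma_{n2}^2 = \frac{4}{n_1}(\mu_1 - \mu_2)'\Sigma_1(\mu_1 - \mu_2) + \frac{4}{n_2}(\mu_1 - \mu_2)'\Sigma_2(\mu_1 - \mu_2)$.

In order to formulate a test procedure based on Theorem 1, $\sigma_{n1}^2$ in (3.9) needs
to be estimated. Bai and Saranadasa (1996) used the following estimator for
$\mathrm{tr}(\Sigma^2)$ under $\Sigma_1 = \Sigma_2 = \Sigma$:

$\widehat{\mathrm{tr}(\Sigma^2)} = \frac{n^2}{(n + 2)(n - 1)}\left\{\mathrm{tr}\,S_n^2 - \frac{1}{n}(\mathrm{tr}\,S_n)^2\right\}$.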

Motivated by the benefits of excluding terms like $X_{ij}'X_{ij}$ in the formulation
of $T_n$, we propose the following estimators of $\mathrm{tr}(\Sigma_i^2)$ and $\mathrm{tr}(\Sigma_1\Sigma_2)$:

$\widehat{\mathrm{tr}(\Sigma_i^2)} = \{n_i(n_i - 1)\}^{-1}\,\mathrm{tr}\Big\{\sum_{j \neq k}^{n_i} (X_{ij} - \bar{X}_{i(j,k)}) X_{ij}' (X_{ik} - \bar{X}_{i(j,k)}) X_{ik}'\Big\}$

and

$\widehat{\mathrm{tr}(\Sigma_1\Sigma_2)} = (n_1 n_2)^{-1}\,\mathrm{tr}\Big\{\sum_{l=1}^{n_1}\sum_{k=1}^{n_2} (X_{1l} - \bar{X}_{1(l)}) X_{1l}' (X_{2k} - \bar{X}_{2(k)}) X_{2k}'\Big\}$,

where $\bar{X}_{i(j,k)}$ is the ith sample mean after excluding $X_{ij}$ and $X_{ik}$, and $\bar{X}_{i(l)}$
is the ith sample mean without $X_{il}$. These are similar to the idea of cross-validation,
in that when we construct the deviations of $X_{ij}$ and $X_{ik}$ from
the sample mean, both $X_{ij}$ and $X_{ik}$ are excluded from the sample mean

calculation. By doing so, the above estimators $\widehat{\mathrm{tr}(\Sigma_i^2)}$ and $\widehat{\mathrm{tr}(\Sigma_1\Sigma_2)}$ can be
written as the trace of sums of products of independent matrices. We also
note that subtraction of only one sample mean per observation is needed in
order to avoid a term like $\|X_{ij}\|^4$, which is harder to control asymptotically
without an explicit assumption between p and n.
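
The leave-out construction can be sketched in code as follows. This is a minimal illustration under our reading of the two estimators (sample means that exclude the observations being multiplied), not the authors' implementation; the function names are hypothetical. Since $\mathrm{tr}(ab'cd') = (b'c)(d'a)$ for vectors a, b, c, d, every summand reduces to a product of two inner products, so only Gram matrices are needed.

```python
import numpy as np

def tr_sigma_sq_hat(x):
    """Leave-two-out estimate of tr(Sigma^2) from rows of x (n x p), n >= 3."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    g = x @ x.T              # g[j, k] = X_j'X_k
    r = g.sum(axis=1)        # r[j] = X_j'(sum of all observations)
    d = np.diag(g)           # d[j] = X_j'X_j
    # a[j, k] = X_j'(X_k - mean of the sample without X_j and X_k)
    a = g - (r[:, None] - d[:, None] - g) / (n - 2)
    prod = a * a.T           # summand for the pair (j, k)
    np.fill_diagonal(prod, 0.0)  # the sum runs over j != k only
    return prod.sum() / (n * (n - 1))

def tr_sigma12_hat(x1, x2):
    """Leave-one-out estimate of tr(Sigma_1 Sigma_2); needs n1, n2 >= 2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]
    c = x1 @ x2.T            # c[l, k] = X_1l'X_2k
    # u[l, k] = X_1l'(X_2k - mean of sample 2 without X_2k)
    u = c - (c.sum(axis=1, keepdims=True) - c) / (n2 - 1)
    # v[l, k] = X_2k'(X_1l - mean of sample 1 without X_1l)
    v = c - (c.sum(axis=0, keepdims=True) - c) / (n1 - 1)
    return (u * v).sum() / (n1 * n2)
```

Working with the $n_i \times n_i$ Gram matrices keeps the cost at order $n^2 p$ and never forms a $p \times p$ matrix, which matters when p is much larger than n.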

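The next theorem shows that the above estimators are ratio-consistent to
$\mathrm{tr}(\Sigma_i^2)$ and $\mathrm{tr}(\Sigma_1\Sigma_2)$, respectively.

Theorem 2. Under the assumptions (3.1)-(3.4) and (3.6), for i = 1 or 2,

$\frac{\widehat{\mathrm{tr}(\Sigma_i^2)}}{\mathrm{tr}(\Sigma_i^2)} \xrightarrow{p} 1$ and $\frac{\widehat{\mathrm{tr}(\Sigma_1\Sigma_2)}}{\mathrm{tr}(\Sigma_1\Sigma_2)} \xrightarrow{p} 1$ as p and $n \to \infty$.

A ratio-consistent estimator of $\sigma_{n1}^2$ under $H_0$ is

$\hat{\sigma}_{n1}^2 = \frac{2}{n_1(n_1 - 1)}\widehat{\mathrm{tr}(\Sigma_1^2)} + \frac{2}{n_2(n_2 - 1)}\widehat{\mathrm{tr}(\Sigma_2^2)} + \frac{4}{n_1 n_2}\widehat{\mathrm{tr}(\Sigma_1\Sigma_2)}$.

This together with Theorem 1 leads to the test statistic

$Q_n = T_n/\hat{\sigma}_{n1} \xrightarrow{d} N(0, 1)$ as p and $n \to \infty$

under $H_0$. The proposed test with an $\alpha$ level of significance rejects $H_0$ if
$Q_n > \xi_\alpha$, where $\xi_\alpha$ is the upper $\alpha$ quantile of N(0, 1).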
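
The decision rule can be sketched as a small helper that takes $T_n$ and the three trace estimates as inputs. The function names are hypothetical. The weights 2, 2 and 4 in the variance estimate are an assumption on our part: they are the constants obtained from the variance of the off-diagonal sums in $T_n$, whereas the source display of the estimator is not fully legible.

```python
import math

def q_statistic(tn, tr11_hat, tr22_hat, tr12_hat, n1, n2):
    """Standardize T_n by the estimated null standard deviation."""
    var_hat = (2.0 * tr11_hat / (n1 * (n1 - 1))
               + 2.0 * tr22_hat / (n2 * (n2 - 1))
               + 4.0 * tr12_hat / (n1 * n2))
    return tn / math.sqrt(var_hat)

def upper_tail_p_value(q):
    """P(Z > q) for a standard normal Z; the test is one-sided."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))

def reject_null(q, alpha=0.05):
    """Reject H_0 when the upper-tail P-value falls below alpha."""
    return upper_tail_p_value(q) < alpha
```

Rejecting when the upper-tail P-value is below $\alpha$ is the same rule as rejecting when $Q_n$ exceeds the upper $\alpha$ quantile of N(0, 1).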

Theorems 1 and 2 allow us to discuss the power properties of the proposed
test. The discussion is made under (3.4) and (3.5), respectively. The power
under the local alternative (3.4) is

(3.11) $\beta_{n1}(\|\mu_1 - \mu_2\|) = \Phi\left(-\xi_\alpha + \frac{nk(1 - k)\|\mu_1 - \mu_2\|^2}{\sqrt{2\,\mathrm{tr}\{\tilde{\Sigma}(k)^2\}}}\right)$,

where $\tilde{\Sigma}(k) = (1 - k)\Sigma_1 + k\Sigma_2$ and $\Phi$ is the standard normal distribution
function. The power of the Bai-Saranadasa test has the same form if $\Sigma_1 = \Sigma_2$
and if p and n are of the same order.
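
For a numerical feel of the local power expression, the sketch below evaluates it for given inputs. It assumes that n in the expression stands for the total sample size $n_1 + n_2$ and that $k = n_1/(n_1 + n_2)$; both are our reading of (3.3), and the function name `local_power` is hypothetical.

```python
from statistics import NormalDist

import numpy as np

def local_power(n, k, mean_diff, sigma1, sigma2, alpha=0.05):
    """Asymptotic power under the local alternative for given parameters."""
    std_normal = NormalDist()
    xi_alpha = std_normal.inv_cdf(1.0 - alpha)  # upper alpha quantile of N(0, 1)
    sigma_k = (1.0 - k) * np.asarray(sigma1, dtype=float) + k * np.asarray(sigma2, dtype=float)
    diff = np.asarray(mean_diff, dtype=float)
    signal = n * k * (1.0 - k) * float(diff @ diff)
    noise = float(np.sqrt(2.0 * np.trace(sigma_k @ sigma_k)))
    return std_normal.cdf(-xi_alpha + signal / noise)

if __name__ == "__main__":
    p = 500
    for delta in (0.0, 0.05, 0.1):
        print(delta, local_power(124, 0.5, np.full(p, delta), np.eye(p), np.eye(p)))
```

When the mean difference is zero the expression returns the nominal level $\alpha$, and it increases with $\|\mu_1 - \mu_2\|^2$ relative to $\sqrt{\mathrm{tr}\{\tilde{\Sigma}(k)^2\}}$.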
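The power under (3.5) is

$\beta_{n2}(\|\mu_1 - \mu_2\|) = \Phi\left(-\frac{\sigma_{n1}}{\sigma_{n2}}\xi_\alpha + \frac{\|\mu_1 - \mu_2\|^2}{\sigma_{n2}}\right)$

as $\sigma_{n1}/\sigma_{n2} \to 0$. Substitute the expression for $\sigma_{n1}$, and we have

(3.12) $\beta_{n2}(\|\mu_1 - \mu_2\|) = \Phi\left(\frac{nk(1 - k)\|\mu_1 - \mu_2\|^2}{\sqrt{2\,\mathrm{tr}\{\tilde{\Sigma}(k)^2\}}}\right)$.
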
Both (3.11) and (3.12) indicate that the proposed test has nontrivial
power under the two cases of the alternative hypothesis as long as

$nk(1 - k)\|\mu_1 - \mu_2\|^2 / \sqrt{2\,\mathrm{tr}\{\tilde{\Sigma}(k)^2\}}$

does not vanish to 0 as n and $p \to \infty$. The flavor of the proposed test
is different from tests formulated by combining p marginal tests on $H_{0l}$
[defined after (1.1)] for $l = 1, \ldots, p$. The test statistics of such tests are usually
constructed via $\max_{1 \le l \le p} T_{nl}$, where $T_{nl}$ is a marginal test statistic for $H_{0l}$.
This is the case of Kosorok and Ma (2007) and Fan, Hall and Yao (2007). A
condition on p and n is needed to ensure (i) the convergence of $\max_{1 \le l \le p} T_{nl}$,
and (ii) p can reach an order of $\exp(\alpha n^{\beta})$ for positive constants $\alpha$ and
$\beta$. Usually some additional assumptions are needed; for instance, Kosorok
and Ma (2007) assume each component of the random vector has compact
support for testing means.

Naturally, if the number of significant univariate hypotheses ($\mu_{1l} \neq \mu_{2l}$)
is a lot less than p, which is the so-called sparsity scenario, a simultaneous
test like the one we propose may encounter a loss of power. This is
actually quantified by the power expression (3.11). Without loss of generality,
suppose that each $\mu_i$ can be partitioned as $(\mu_i^{(1)\prime}, \mu_i^{(2)\prime})'$ so that under
$H_1$: $\mu_1^{(1)} = \mu_2^{(1)}$ and $\mu_1^{(2)} \neq \mu_2^{(2)}$, where $\mu_i^{(1)}$ is $p_1$-dimensional, $\mu_i^{(2)}$ is
$p_2$-dimensional and $p_1 + p_2 = p$. Then $\|\mu_1 - \mu_2\|^2 = p_2\delta^2$ for some positive
constant $\delta^2$. Let $\lambda_{m_0}$ be the smallest nonzero eigenvalue of $\tilde{\Sigma}(k)$.
Then under the local alternative (3.4), the asymptotic power is bounded
above and below by

$\Phi\left(-\xi_\alpha + \frac{nk(1 - k)p_2\delta^2}{\sqrt{2p}\,\lambda_p}\right) \le \beta(\|\mu_1 - \mu_2\|) \le \Phi\left(-\xi_\alpha + \frac{nk(1 - k)p_2\delta^2}{\sqrt{2(p - m_0)}\,\lambda_{m_0}}\right)$.

If p is very large relative to n and $p_2$ under both high-dimensionality and
sparsity, so that $nk(1 - k)p_2\delta^2 / \sqrt{2(p - m_0)} \to 0$, the test could endure low
power. With this in mind, we check on the performance of the test under
sparsity in simulation studies in Section 5. The simulations show that the 
proposed test has a robust power and is in fact more powerful than tests 
based on multiple comparisons with either the Bonferroni or false discovery 
rate (FDR) procedures. We note here that, due to the multivariate nature 
of the test and the hypothesis, the proposed test cannot identify which 
components are significant after the null multivariate hypothesis is rejected. 
Additional follow-up procedures have to be employed for that purpose. The 
proposed test becomes very useful when the purpose is to identify significant 
groups of components like sets of genes, as illustrated in Section 4. The above 
discussion can be readily extended to the case of (3.5) due to the similarity 
in the two power functions. 
The proposed two-sample test can be modified for paired observations
$\{(Y_{i1}, Y_{i2})\}_{i=1}^{n}$ where $Y_{i1}$ and $Y_{i2}$ are two measurements of p-dimensions on
a subject i before and after a treatment. Let $X_i = Y_{i2} - Y_{i1}$, $\mu = E(X_i)$ and
$\Sigma = \mathrm{Var}(X_i)$. This is effectively a one-sample problem with high-dimensional
data. The hypothesis of interest is

$H_0: \mu = 0$ vs. $H_1: \mu \neq 0$.

We can use $F_n = \sum_{i \neq j}^{n} X_i'X_j / \{n(n - 1)\}$ as the test statistic. It is readily
shown that $E(F_n) = \mu'\mu$ and $\mathrm{Var}(F_n) = \frac{2}{n(n - 1)}\mathrm{tr}(\Sigma^2)\{1 + o(1)\}$ under
both $H_0$ and $H_1$ if we assume a condition similar to (3.4) so that
$\mu'\Sigma\mu = o\{n^{-1}\mathrm{tr}(\Sigma^2)\}$. By adding $\mathrm{tr}(\Sigma^4) = o\{\mathrm{tr}^2(\Sigma^2)\}$, a variation of (3.6),
the asymptotic normality of $F_n$ can be established by utilizing part of
the proof on the asymptotic normality of $T_n$. The $\mathrm{tr}(\Sigma^2)$ can be ratio-consistently
estimated with $n_1$ replaced by n in $\widehat{\mathrm{tr}(\Sigma_1^2)}$, which leads to a
ratio-consistent variance estimation for $F_n$. Then the test and its power can
be written out in similar ways as those for the two-sample test.
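
The paired-sample statistic admits the same off-diagonal shortcut as the two-sample one. The sketch below is illustrative only; the names `fn_statistic` and `paired_fn` are hypothetical, and the inputs are assumed to be arrays with one subject per row.

```python
import numpy as np

def fn_statistic(x):
    """F_n = sum_{i != j} X_i'X_j / {n(n - 1)} for rows of x (n x p)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    s = x.sum(axis=0)
    # sum_{i != j} X_i'X_j = ||sum_i X_i||^2 - sum_i ||X_i||^2
    return (s @ s - np.sum(x * x)) / (n * (n - 1))

def paired_fn(before, after):
    """Apply F_n to the within-subject differences X_i = Y_i2 - Y_i1."""
    return fn_statistic(np.asarray(after, dtype=float) - np.asarray(before, dtype=float))
```

Under $H_0$ the statistic is centered at zero, and under $H_1$ it is centered at $\mu'\mu$, which is why large values are evidence against $H_0$.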

When p = O(1), which may be viewed as having finite dimension, the
asymptotic normality as conveyed in Theorem 1 may not be valid anymore.
It may be shown under conditions (3.1)-(3.4) without (3.6), as condition
(3.6) is no longer relevant when p is bounded, that the test statistic
$(n_1 + n_2)T_n$ converges to $\sum_{l=1}^{2p} \eta_l \chi_{1,l}^2$, where $\{\chi_{1,l}^2\}_{l=1}^{2p}$ are independent $\chi_1^2$
distributed random variables, and $\{\eta_l\}_{l=1}^{2p}$ is a set of constants. The conclusion
of Theorem 2 remains valid when p is bounded. The proposed test can still
be used for testing in this situation of bounded dimension with estimated
critical values via estimation of $\{\eta_l\}_{l=1}^{2p}$. However, people may like to use a
test specially catered for such cases, for instance, Hotelling's test.

4. Gene-set testing. Identifying sets of genes which are significant with 
respect to certain treatments is the latest development in genetics research 
[see Barry, Nobel and Wright (2005), Recknor, Nettleton and Reecy (2008), 
Efron and Tibshirani (2007) and Newton et al. (2007)]. Biologically speaking,
each gene does not function individually in isolation. Rather, one gene tends 
to work with other genes to achieve certain biological tasks. 

Let $S_1, \ldots, S_q$ be q sets of genes, where the gene-set $S_g$ consists
of $p_g$ genes. Let $F_{1S_g}$ and $F_{2S_g}$ be the distribution functions corresponding
to $S_g$ under the treatment and control, and $\mu_{1S_g}$ and $\mu_{2S_g}$ be their respective
means. The hypothesis of interest is

$H_{0g}: \mu_{1S_g} = \mu_{2S_g}$ for $g = 1, \ldots, q$.

The gene sets $\{S_g\}_{g=1}^{q}$ can overlap as a gene can belong to several functional
groups, and $p_g$, the number of genes in a set, can range from a moderate
to a very large number. So, there are issues of both multiplicity and
high-dimensionality in gene-set testing.

We propose applying the proposed test for the significance of each gene-set
$S_g$ when $p_g$ is large. When $p_g$ is of low dimension, Hotelling's test may
be used. Let $pv_g$, $g = 1, \ldots, q$, be the P-values obtained from these tests.
To control the overall family-wise error rate, we can employ the Bonferroni
procedure; to control FDR, we can use Benjamini and Hochberg's (1995)
method or its variations as in Benjamini and Yekutieli (2001) and Storey,
Taylor and Siegmund (2004). These lead to control of the family-wise error
rate or FDR in the context of gene-sets testing. In contrast, tests based on 
univariate testing have difficulties in producing P-values for gene-sets. 
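
Given the gene-set P-values, the two multiplicity corrections mentioned above can be sketched as follows. The function names are hypothetical, and the second function is the standard Benjamini-Hochberg step-up rule rather than any of its later variations.

```python
import numpy as np

def bonferroni_reject(p_values, alpha=0.05):
    """Reject gene-set g when pv_g <= alpha / q (family-wise error control)."""
    p = np.asarray(p_values, dtype=float)
    return p <= alpha / p.size

def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Step-up rule: reject the i* smallest P-values, where i* is the largest
    rank i with p_(i) <= alpha * i / q (FDR control)."""
    p = np.asarray(p_values, dtype=float)
    q = p.size
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, q + 1) / q
    reject = np.zeros(q, dtype=bool)
    if passed.any():
        last = np.max(np.nonzero(passed)[0])  # index of the largest passing rank
        reject[order[: last + 1]] = True
    return reject
```

Both functions return a boolean mask in the original order of the gene-sets, so overlapping sets can be looked up by index afterwards.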

Acute lymphoblastic leukemia (ALL) is a form of leukemia, a cancer of 
white blood cells. The ALL data [Chiaretti et al. (2004)] contains microar- 
ray expressions for 128 patients with either T-cell or B-cell type leukemia. 
Within the B-cell type leukemia, there are two sub-classes representing two 
molecular classes: the BCR/ABL class and NEG class. The data set has 
been analyzed by Dudoit, Keles and van der Laan (2008) using a different 
technology. 

Gene-sets are technically defined in the gene ontology (GO) system, which
provides structured and controlled vocabularies producing names of gene-sets
(also called GO terms). There are three groups of gene ontologies of interest:
biological processes (BP), cellular components (CC) and molecular functions
(MF). We carried out preliminary screening for gene-filtering using the
approach in Gentleman et al. (2005), which left 2391 genes for analysis. There
are 575 unique GO terms in the BP category, 221 in MF and 154 in CC for the
ALL data. The largest gene-set contains 2059 genes in BP, 2112 genes in MF
and 2078 genes in CC; and the GO terms of the three categories share 1861
common genes. We are interested in detecting differences in the expression
levels of gene-sets between the BCR/ABL molecular sub-class ($n_1 = 37$) and
the NEG molecular sub-class ($n_2 = 42$) for each of the three categories.

We applied the proposed two-sample test with a 5% significance level to 
test each of the gene-sets in conjunction with the Bonferroni correction to 
control the family-wise error rate at 0.05 level. It was found that there were 
259 gene-sets declared significant in the BP group, 110 in the MF group 
and 53 in the CC group. Figure 1 displays the histograms of the P-values 
and the values of test statistic Qn for the three gene-categories. It shows 
a strong nonuniform distribution of the P-values with a large number of 
P-values clustered near 0. At the same time, the $Q_n$-value plots indicate
the average $Q_n$-values are much larger than zero. These explain the large
number of significant gene-sets detected by the proposed test.

The number of the differentially expressed gene-sets may seem to be high.
This was mainly due to overlapping gene-sets. To appreciate this point, we
computed for each (say ith) significant gene-set the number of other significant
gene-sets which overlapped with it, say $b_i$, and obtained the average
of $\{b_i\}$ and their standard deviation. The average number of overlaps (standard
deviation) was 198.9 (51.3) for the BP group, 55.6 (25.2) for MF and 41.6
(9.5) for CC. These numbers are indeed very high and reveal that the gene-sets
and their P-values are highly dependent.

Fig. 1. Two-sample tests for differentially expressed gene-sets between BCR/ABL and
NEG class ALL: histograms of P-values (left panels) and $Q_n$-values (right panels) for BP,
CC and MF gene categories.

Finally, we carried out back-testing for the same hypothesis by randomly
splitting the 42 NEG class into two sub-classes of equal sample size and
testing for mean differences. This set-up led to the situation of $H_0$. Figure 2
reports the P-values and $Q_n$-values for the three gene ontology groups. We
note that the distributions of the P-values are much closer to the uniform
distribution than in Figure 1. It is observed that the histograms of $Q_n$-values
are centered close to zero and are much closer to the normal distribution
than their counterparts in Figure 1, which is reassuring.

Fig. 2. Back-testing for differentially expressed gene-sets between two randomly assigned
NEG groups: histograms of P-values (left panels) and $Q_n$-values (right panels) for BP,
CC and MF gene categories.



5. Simulation studies. In this section, we report results from simulation 
studies which were designed to evaluate the performance of the proposed 
two-sample tests for high-dimensional data. For comparison, we also con- 
ducted the test proposed by Bai and Saranadasa (1996) (BS test), and two 
tests based on multiple comparison procedures by employing the Bonferroni 
and the FDR control [Benjamini and Hochberg (1995)]. The Bonferroni procedure
controls the family-wise error rate at a level of significance $\alpha$ which coincides
with the significance for the FDR control, the proposed test and the BS test.
In the two multiple comparison procedures, we conducted univariate
two-sample t-tests for the univariate hypotheses $H_{0l}: \mu_{1l} = \mu_{2l}$ vs. $H_{1l}: \mu_{1l} \neq \mu_{2l}$
for $l = 1, 2, \ldots, p$.

Two simulation models for Xij are considered. One has a moving average 
structure that allows a general dependent structure; the other could allocate
the alternative hypotheses sparsely which enables us to evaluate the
performance of the tests under sparsity.

5.1. Moving average model. The first simulation model has the following 
moving average structure: 

$X_{ijk} = \rho_1 Z_{ijk} + \rho_2 Z_{ijk+1} + \cdots + \rho_p Z_{ijk+p-1} + \mu_{ij}$

for i = 1 and 2, $j = 1, 2, \ldots, n_i$ and $k = 1, 2, \ldots, p$, where $\{Z_{ijk}\}$ are, respectively,
i.i.d. random variables. We consider two distributions for the innovations
$\{Z_{ijk}\}$. One is a centralized Gamma(4, 1) so that it has zero mean,
and the other is N(0, 1).

For each distribution of $\{Z_{ijk}\}$, we consider two configurations of dependence
among components of $X_{ij}$. One has weaker dependence with $\rho_l = 0$
for l > 3. This prescribes a "two dependence" moving average structure
where $X_{ijk_1}$ and $X_{ijk_2}$ are dependent only if $|k_1 - k_2| \le 2$. The $\{\rho_l\}_{l=1}^{3}$ are
generated independently from U(2, 3) which are $\rho_1 = 2.883$, $\rho_2 = 2.794$ and
$\rho_3 = 2.849$ and are kept fixed throughout the simulation. The second configuration
has all $\rho_l$'s generated from U(2, 3), and again remain fixed throughout
the simulation. We call this the "full dependence case." The above dependence
structures assign equal covariance matrices $\Sigma_1 = \Sigma_2 = \Sigma$ and allow
a meaningful comparison with the BS test.
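
A data generator for this moving average design might look like the sketch below. It is a minimal illustration, not the authors' simulation code: the function name, its arguments and the use of NumPy's `Generator` are assumptions, and the Gamma innovations are only centered (mean 4 subtracted), as described in the text.

```python
import numpy as np

def simulate_moving_average(n, p, rho, mu, rng, innovations="normal"):
    """Draw n observations of dimension p with X_k = sum_l rho_l Z_{k+l-1} + mu_k."""
    rho = np.asarray(rho, dtype=float)
    width = p + rho.size - 1  # number of innovations needed per observation
    if innovations == "gamma":
        # Gamma(4, 1) has mean 4; subtracting it centralizes the innovations
        z = rng.gamma(shape=4.0, scale=1.0, size=(n, width)) - 4.0
    else:
        z = rng.standard_normal((n, width))
    x = np.zeros((n, p))
    for lag, coef in enumerate(rho):
        x += coef * z[:, lag:lag + p]  # adds rho_{lag+1} * Z_{k+lag}
    return x + np.asarray(mu, dtype=float)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho_two_dep = [2.883, 2.794, 2.849]  # coefficients quoted in the text
    x = simulate_moving_average(124, 500, rho_two_dep, np.zeros(500), rng, "gamma")
    print(x.shape)
```

Passing a coefficient vector of length p instead of 3 gives the "full dependence" configuration with the same function.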

Without loss of generality, we fix $\mu_1 = 0$ and choose $\mu_2$ in the same fashion
as Benjamini and Hochberg (1995). Specifically, the percentage of true
null hypotheses $\mu_{1l} = \mu_{2l}$ for $l = 1, \ldots, p$ was chosen to be 0%, 25%, 50%,
75%, 95%, 99% and 100%, respectively. By experimenting with 95%
and 99% we gain information on the performance of the test when $\mu_{1l} \neq \mu_{2l}$
are sparse. It provides empirical checks on the potential concerns about the
power of the simultaneous high-dimensional tests as made at the end of
Section 3. At each percentage level of true null, three patterns of allocation
are considered for the nonzero $\mu_{2l}$ in $\mu_2 = (\mu_{21}, \ldots, \mu_{2p})'$: (i) the equal
allocation where all the nonzero $\mu_{2l}$ are equal; (ii) linearly increasing and
(iii) linearly decreasing allocations as specified in Benjamini and Hochberg
(1995). To make the power comparable among the configurations of $H_1$,
we set $\eta =: \|\mu_1 - \mu_2\|^2 / \sqrt{\mathrm{tr}(\Sigma^2)} = 0.1$ throughout the simulation. We chose
p = 500 and 1000 and n = [20 log(p)] = 124 and 138, respectively.
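
The sample sizes and the signal scaling can be reproduced with a short calculation. In the sketch below, `sample_size` implements n = [20 log(p)] with the natural logarithm and the integer part, which matches the quoted values. `equal_allocation_mu2` is a hypothetical helper for the equal allocation pattern only, and placing the nonzero entries first is an arbitrary choice made for the illustration.

```python
import math

import numpy as np

def sample_size(p):
    """n = [20 log(p)]: integer part of 20 times the natural logarithm of p."""
    return math.floor(20 * math.log(p))

def equal_allocation_mu2(p, frac_true_null, tr_sigma_sq, eta=0.1):
    """mu_2 with equal nonzero entries so that ||mu_2||^2 / sqrt(tr(Sigma^2)) = eta,
    taking mu_1 = 0."""
    n_nonzero = p - int(round(frac_true_null * p))
    mu2 = np.zeros(p)
    if n_nonzero > 0:
        # each nonzero entry carries an equal share of the squared norm
        mu2[:n_nonzero] = math.sqrt(eta * math.sqrt(tr_sigma_sq) / n_nonzero)
    return mu2

if __name__ == "__main__":
    print(sample_size(500), sample_size(1000))
```

Holding $\eta$ fixed means that sparser alternatives concentrate the same total signal on fewer components, which is what makes the power columns comparable across the percentages of true nulls.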

Table 1
Empirical power and size for the 2-dependence model with Gamma innovation

                                   p = 500, n = 124                 p = 1000, n = 138
Type of       % of
allocation    true null      NEW     BS      Bonf    FDR       NEW     BS      Bonf    FDR
Equal         0%             0.511   0.399   0.13    0.16      0.521   0.413   0.11    0.16
              25%            --      --      0.14    0.16      --      --      0.12    0.16
              50%            0.513   0.401   0.13    0.17      0.531   0.422   0.12    0.17
              75%            0.522   0.389   0.13    0.18      0.530   0.416   0.11    0.17
              95%            --      --      0.14    0.16      --      --      0.13    0.17
              99%            0.499   0.388   0.13    0.15      0.507   0.408   0.15    0.18
              100% (size)    0.043   0.043   0.040   0.041     0.043   0.042   0.042   0.042
Increasing    0%             0.520   0.425   0.11    0.13      0.522   0.409   0.12    0.15
              25%            0.515   0.431   0.12    0.15      0.523   0.412   0.14    0.16
              50%            0.512   0.412   0.13    0.15      0.528   0.421   0.15    0.17
              75%            0.522   0.409   0.15    0.17      0.531   0.431   0.16    0.19
              95%            0.488   0.401   0.14    0.15      0.500   0.410   0.15    0.17
              99%            0.501   0.409   0.15    0.17      0.511   0.412   0.15    0.16
              100% (size)    0.042   0.041   0.040   0.041     0.042   0.040   0.039   0.041
Decreasing    0%             0.522   0.395   0.11    0.15      0.533   0.406   0.09    0.15
              25%            0.530   0.389   0.11    0.15      0.530   0.422   0.11    0.17
              50%            0.528   0.401   0.12    0.17      0.522   0.432   0.12    0.17
              75%            0.533   0.399   0.13    0.18      0.519   0.421   0.12    0.17
              95%            0.511   0.410   0.12    0.15      0.508   0.411   0.15    0.18
              99%            0.508   0.407   0.14    0.15      0.507   0.418   0.16    0.17
              100% (size)    0.041   0.042   0.041   0.042     0.042   0.040   0.040   0.042

(-- marks an entry that is not legible in the source.)

Table 2
Empirical power and size for the full-dependence model with Gamma innovation

                                   p = 500, n = 124                 p = 1000, n = 138
Type of       % of
allocation    true null      NEW     BS      Bonf    FDR       NEW     BS      Bonf    FDR
Equal         0%             0.322   0.120   0.08    0.10      0.402   0.216   0.09    0.11
              25%            --      --      0.08    0.10      --      --      0.08    0.11
              50%            0.316   0.115   0.09    0.11      0.409   0.221   0.09    0.10
              75%            0.307   0.113   0.10    0.12      0.410   0.213   0.09    0.13
              95%            --      --      0.11    0.14      --      --      0.10    0.13
              99%            0.225   0.138   0.12    0.15      0.316   0.207   0.11    0.12
              100% (size)    0.041   0.041   0.043   0.043     0.042   0.042   0.040   0.041
Increasing    0%             0.331   0.121   0.09    0.12      0.430   0.225   0.10    0.11
              25%            0.336   0.119   0.10    0.12      0.423   0.231   0.12    0.12
              50%            0.329   0.123   0.12    0.14      0.422   0.226   0.13    0.14
              75%            0.330   0.115   0.12    0.15      0.431   0.222   0.14    0.15
              95%            0.219   0.120   0.12    0.13      0.311   0.218   0.14    0.15
              99%            0.228   0.117   0.13    0.15      0.315   0.217   0.15    0.17
              100% (size)    0.041   0.040   0.042   0.043     0.042   0.042   0.040   0.042
Decreasing    0%             0.320   0.117   0.08    0.11      0.411   0.213   0.08    0.10
              25%            0.323   0.119   0.09    0.11      0.408   0.210   0.08    0.11
              50%            0.327   0.120   0.11    0.12      0.403   0.208   0.09    0.10
              75%            0.322   0.122   0.12    0.12      0.400   0.211   0.12    0.13
              95%            0.217   0.109   0.12    0.15      0.319   0.207   0.12    0.15
              99%            0.224   0.111   0.13    0.16      0.327   0.205   0.11    0.13
              100% (size)    0.042   0.043   0.039   0.041     0.042   0.211   0.040   0.041

(-- marks an entry that is not legible in the source.)

Tables 1 and 2 report the empirical power and size of the four tests with Gamma innovations at a 5% nominal significance level or family-wise error rate or FDR based on 5000 simulations. The results for the normal innovations have a similar pattern, and are not reported here. The simulation results in Tables 1 and 2 can be summarized as follows. The proposed test is much more powerful than the Bai-Saranadasa test for all cases considered in the simulation while maintaining a reasonably-sized approximation to the nominal 5% level. Both the proposed test and the Bai-Saranadasa test are more powerful than the two tests based on the multiple univariate testing using the Bonferroni and FDR procedures. This is expected as both the proposed and Bai-Saranadasa tests are designed to test for the entire $p$-dimensional hypotheses while the multiple testing procedures are targeted at the individual univariate hypotheses. What is surprising is that when the percentage of true null is high, at 95% and 99%, the proposed test still is much more powerful than the two multiple testing procedures for all three allocations of the nonzero components in $\mu_2$. It is observed that the sparsity (95% and 99% true null) does reduce the power of the proposed test a little. However, the proposed test still enjoys good power, especially when compared with the other three tests. We also observe that when there is more dependence among multivariate components of the data vectors in the full dependence model, there is a drop in the power for each of the tests. The power of the tests based on the Bonferroni and FDR procedures is alarmingly low and is only slightly larger than the nominal significance level.
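For readers who want to see how a multiple-testing procedure can serve as a test of the global hypothesis, here is a minimal sketch. It assumes the global null is rejected whenever at least one marginal hypothesis is rejected; how the marginal p-values were obtained in the study is not specified in this passage, so they are taken as given, and the function names are hypothetical.

```python
import numpy as np

def bonferroni_rejects_global(pvalues, alpha=0.05):
    """Reject the global null if any p-value is at most alpha / p."""
    pvalues = np.asarray(pvalues, dtype=float)
    return bool(np.any(pvalues <= alpha / pvalues.size))

def bh_rejects_global(pvalues, alpha=0.05):
    """Reject the global null if the Benjamini-Hochberg step-up rule rejects anything."""
    sorted_p = np.sort(np.asarray(pvalues, dtype=float))
    p = sorted_p.size
    thresholds = alpha * np.arange(1, p + 1) / p   # k * alpha / p for k = 1, ..., p
    return bool(np.any(sorted_p <= thresholds))

pvals = [0.020, 0.030, 0.035, 0.040]
bonf_decision = bonferroni_rejects_global(pvals)   # min p = 0.020 > 0.0125
bh_decision = bh_rejects_global(pvals)             # 0.035 <= 3 * 0.05 / 4
```

In this toy input no single p-value clears the Bonferroni cutoff, while the step-up rule rejects, which illustrates why the FDR-based test is never less powerful than the Bonferroni-based one as a global test.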

We also collected information on the quality of $\operatorname{tr}(\Sigma^2)$ estimation. Table 3 reports empirical averages and standard deviations of $\widehat{\operatorname{tr}(\Sigma^2)}/\operatorname{tr}(\Sigma^2)$. It shows that the proposed estimator for $\operatorname{tr}(\Sigma^2)$ has a much smaller bias and standard deviation than those proposed in Bai and Saranadasa (1996) in all cases, and provides an empirical verification for Theorem 2.

Table 3
Empirical averages of estimated tr(Σ²) / tr(Σ²) with standard deviations in the parentheses

Type of       Type of
innovation    dependence          NEW                BS                 tr(Σ²)

p = 500, n = 124
Normal        2-dependence        1.03 (0.015)       1.39 (0.016)       3102
              Full-dependence     1.008 (0.00279)    1.17 (0.0032)      35,911
Gamma         2-dependence        1.03 (0.006)       1.10 (0.007)       14,227
              Full-dependence     1.108 (0.0019)     1.248 (0.0017)     152,248

p = 1000, n = 138
Normal        2-dependence        0.986 (0.0138)     1.253 (0.0136)     6563
              Full-dependence     0.995 (0.0026)     1.072 (0.0033)     76,563
Gamma         2-dependence        1.048 (0.005)      1.138 (0.006)      32,104
              Full-dependence     1.088 (0.00097)    1.231 (0.0013)     325,879

5.2. Sparse model. An examination of the previous simulation setting reveals that the strength of the "signals" $\mu_{2l} - \mu_{1l}$ corresponding to the alternative hypotheses is low relative to the level of noise (variance), which may not be a favorable situation for the two tests based on multiple univariate testing. To gain more information on the performance of the tests under sparsity, we consider the following simulation model such that

$$X_{1il} = Z_{1il} \quad\text{and}\quad X_{2il} = \mu_l + Z_{2il} \quad\text{for } l = 1, \ldots, p,$$

where $\{Z_{1il}, Z_{2il}\}_{l=1}^{p}$ are mutually independent $N(0, 1)$ random variables, and the "signals,"

$$\mu_l = \varepsilon\sqrt{2\log(p)} \quad\text{for } l = 1, \ldots, q = [p^c] \quad\text{and}\quad \mu_l = 0 \quad\text{for } l > q,$$

for some $c \in (0, 1)$. Here $q$ is the number of significant alternative hypotheses. The sparsity of the hypotheses is determined by $c$: the smaller the $c$ is, the more sparse the alternative hypotheses with $\mu_l \ne 0$. This simulation model is similar to the one used in Abramovich et al. (2006).
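The sparse model is simple enough to simulate directly. The sketch below builds the signal vector and draws the two samples; the function names are hypothetical, and $q$ is passed in explicitly because the rounding convention behind $[p^c]$ is not spelled out here (the values 6, 11, 22 and 44 used below are the ones reported for $p = 1000$ in this section).

```python
import math
import numpy as np

def sparse_signal(p, q, eps):
    """mu_l = eps * sqrt(2 log p) for the first q coordinates, 0 afterwards."""
    mu = np.zeros(p)
    mu[:q] = eps * math.sqrt(2.0 * math.log(p))
    return mu

def simulate_sparse(n1, n2, mu, rng):
    """X_1 has N(0, 1) entries; X_2 adds the signal mu to N(0, 1) entries."""
    p = mu.size
    X1 = rng.standard_normal((n1, p))
    X2 = mu + rng.standard_normal((n2, p))
    return X1, X2

p = 1000
reported_q = {0.25: 6, 0.35: 11, 0.45: 22, 0.55: 44}   # values quoted in the text
raw_q = {c: p ** c for c in reported_q}                 # unrounded p^c, for comparison
mu = sparse_signal(p, q=reported_q[0.25], eps=0.25)
X1, X2 = simulate_sparse(10, 10, mu, np.random.default_rng(1))
```

Comparing `raw_q` with `reported_q` shows that each reported $q$ lies within one unit of $p^c$.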

According to (3.11), the proposed test has the asymptotic power

$$\beta(\|\mu\|) = \Phi\left(-\xi_\alpha + \frac{n\,p^{c-1/2}\,\varepsilon^2 \log(p)}{2\sqrt{2}}\right),$$

which indicates that the test has a much reduced power if $c < 1/2$ with respect to $p$. We, therefore, chose $p = 1000$ and $c = 0.25$, $0.35$, $0.45$ and $0.55$, respectively, which leads to $q = 6$, $11$, $22$ and $44$, respectively. We call $c = 0.25$, $0.35$ and $0.45$ the sparse cases.

In order to prevent trivial powers of $\alpha$ or 1 in the simulation, we set $\varepsilon = 0.25$ for $c = 0.25$ and $0.45$; and $\varepsilon = 0.15$ for $c = 0.35$ and $0.55$. Table 4 summarizes the simulation results based on 500 simulations. It shows that in the extreme sparse case of $c = 0.25$, the FDR and Bonferroni tests have higher power than the proposed test at the larger sample sizes $n_1 = n_2 = 20$ and $30$. The power is largely similar among the three tests for $c = 0.35$. However, when the sparsity is moderated to $c = 0.45$, the proposed test starts to surpass the FDR and Bonferroni procedures. The gap in power performance is further increased when $c = 0.55$. Table 5 reports the quality of the variance estimation, which shows that the proposed variance estimators incur very little bias and variance even for very small sample sizes of $n_1 = n_2 = 10$.

Table 4
Empirical power and size for the sparse model

                                      ε = 0.25                            ε = 0.15
Sample size                  c = 0.25         c = 0.45          c = 0.35         c = 0.55
(n1 = n2)     Methods      Power   Size     Power   Size      Power   Size     Power   Size
10            FDR          0.084   0.056    0.180   0.040     0.044   0.034    0.066   0.034
              Bonf         0.084   0.056    0.170   0.040     0.044   0.034    0.062   0.032
              New          0.100   0.046    0.546   0.056     0.072   0.064    0.344   0.064
20            FDR          0.380   0.042    0.855   0.044     0.096   0.036    0.326   0.058
              Bonf         0.368   0.038    0.806   0.044     0.092   0.034    0.308   0.056
              New          0.238   0.052    0.976   0.042     0.106   0.052    0.852   0.046
30            FDR          0.864   0.042    1       0.060     0.236   0.048    0.710   0.038
              Bonf         0.842   0.038    0.996   0.060     0.232   0.048    0.660   0.038
              New          0.408   0.050    0.998   0.058     0.220   0.054    0.988   0.042



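6. Technical details.

6.1. Derivations for $E(T_n)$ and $\operatorname{Var}(T_n)$. As

$$T_n = \frac{\sum_{i \ne j}^{n_1} X_{1i}'X_{1j}}{n_1(n_1-1)} + \frac{\sum_{i \ne j}^{n_2} X_{2i}'X_{2j}}{n_2(n_2-1)} - 2\,\frac{\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} X_{1i}'X_{2j}}{n_1 n_2},$$

it is straightforward to show that $E(T_n) = \mu_1'\mu_1 + \mu_2'\mu_2 - 2\mu_1'\mu_2 = \|\mu_1 - \mu_2\|^2$.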
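The statistic $T_n$ can be computed without forming all pairwise inner products, since $\sum_{i \ne j} X_i'X_j = \|\sum_i X_i\|^2 - \sum_i \|X_i\|^2$. The sketch below uses that identity and checks it against the literal double sums; the function names are hypothetical and the data are simulated only for the check.

```python
import numpy as np

def tn_statistic(X1, X2):
    """T_n from the two samples (rows are observations), in O((n1 + n2) p) time."""
    n1, n2 = X1.shape[0], X2.shape[0]
    s1, s2 = X1.sum(axis=0), X2.sum(axis=0)
    # sum_{i != j} X_i' X_j = ||sum_i X_i||^2 - sum_i ||X_i||^2
    within1 = (s1 @ s1 - np.sum(X1 * X1)) / (n1 * (n1 - 1))
    within2 = (s2 @ s2 - np.sum(X2 * X2)) / (n2 * (n2 - 1))
    cross = 2.0 * (s1 @ s2) / (n1 * n2)
    return float(within1 + within2 - cross)

def tn_bruteforce(X1, X2):
    """Literal double sums, used only to check the fast version."""
    n1, n2 = X1.shape[0], X2.shape[0]
    t = sum(X1[i] @ X1[j] for i in range(n1) for j in range(n1) if i != j) / (n1 * (n1 - 1))
    t += sum(X2[i] @ X2[j] for i in range(n2) for j in range(n2) if i != j) / (n2 * (n2 - 1))
    t -= 2.0 * sum(X1[i] @ X2[j] for i in range(n1) for j in range(n2)) / (n1 * n2)
    return float(t)

rng = np.random.default_rng(0)
X1 = rng.standard_normal((4, 7))
X2 = rng.standard_normal((5, 7)) + 0.5
fast, slow = tn_statistic(X1, X2), tn_bruteforce(X1, X2)
```

Dropping the diagonal terms $X_i'X_i$ is what makes $E(T_n)$ equal to $\|\mu_1 - \mu_2\|^2$ exactly, with no contribution from $\operatorname{tr}(\Sigma_i)$.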
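Let $P_1 = \dfrac{\sum_{i \ne j}^{n_1} X_{1i}'X_{1j}}{n_1(n_1-1)}$, $P_2 = \dfrac{\sum_{i \ne j}^{n_2} X_{2i}'X_{2j}}{n_2(n_2-1)}$ and $P_3 = -2\,\dfrac{\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} X_{1i}'X_{2j}}{n_1 n_2}$. It can be shown that

$$\operatorname{Var}(P_1) = \frac{2}{n_1(n_1-1)}\operatorname{tr}(\Sigma_1^2) + \frac{4\mu_1'\Sigma_1\mu_1}{n_1},$$

$$\operatorname{Var}(P_2) = \frac{2}{n_2(n_2-1)}\operatorname{tr}(\Sigma_2^2) + \frac{4\mu_2'\Sigma_2\mu_2}{n_2}$$

and

$$\operatorname{Var}(P_3) = \frac{4}{n_1 n_2}\operatorname{tr}(\Sigma_1\Sigma_2) + \frac{4\mu_2'\Sigma_1\mu_2}{n_1} + \frac{4\mu_1'\Sigma_2\mu_1}{n_2}.$$

Because the two samples are independent, $\operatorname{Cov}(P_1, P_2) = 0$. Also,

$$\operatorname{Cov}(P_1, P_3) = -\frac{4\mu_1'\Sigma_1\mu_2}{n_1} \quad\text{and}\quad \operatorname{Cov}(P_2, P_3) = -\frac{4\mu_1'\Sigma_2\mu_2}{n_2}.$$

In summary,

$$\operatorname{Var}(T_n) = \frac{2}{n_1(n_1-1)}\operatorname{tr}(\Sigma_1^2) + \frac{2}{n_2(n_2-1)}\operatorname{tr}(\Sigma_2^2) + \frac{4}{n_1 n_2}\operatorname{tr}(\Sigma_1\Sigma_2) + \frac{4}{n_1}(\mu_1-\mu_2)'\Sigma_1(\mu_1-\mu_2) + \frac{4}{n_2}(\mu_1-\mu_2)'\Sigma_2(\mu_1-\mu_2).$$

Thus, under $H_0$,

$$\operatorname{Var}(T_n) = \sigma_{n1}^2 =: \frac{2}{n_1(n_1-1)}\operatorname{tr}(\Sigma_1^2) + \frac{2}{n_2(n_2-1)}\operatorname{tr}(\Sigma_2^2) + \frac{4}{n_1 n_2}\operatorname{tr}(\Sigma_1\Sigma_2).$$

Under $H_1: \mu_1 \ne \mu_2$, with (3.4),

$$\operatorname{Var}(T_n) = \sigma_{n1}^2\{1 + o(1)\};$$

and with (3.5),

$$\operatorname{Var}(T_n) = \sigma_{n2}^2\{1 + o(1)\},$$

where $\sigma_{n2}^2 = \frac{4}{n_1}(\mu_1-\mu_2)'\Sigma_1(\mu_1-\mu_2) + \frac{4}{n_2}(\mu_1-\mu_2)'\Sigma_2(\mu_1-\mu_2)$.

Table 5
Average ratios of estimated σ²_n1 to σ²_n1 and their standard deviation (in parentheses) for the sparse model

                                         ε = 0.25                               ε = 0.15
Sample size      True σ²_n1     c = 0.25          c = 0.45           c = 0.35          c = 0.55
n1 = n2 = 10     84.4           1.003 (0.0123)    1.005 (0.0116)     0.998 (0.0120)    0.999 (0.0110)
n1 = n2 = 20     20.5           1.003 (0.0033)    1.000 (0.0028)     1.003 (0.0028)    1.002 (0.0029)
n1 = n2 = 30     9.0            0.996 (0.0013)    0.998 (0.0013)     1.004 (0.0014)    0.999 (0.0013)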
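The null variance $\sigma_{n1}^2$ is easy to evaluate once the covariance matrices are known. As a worked check, the sketch below evaluates it for the sparse model of Section 5.2, where $\Sigma_1 = \Sigma_2 = I_p$ with $p = 1000$, and compares the result with the "True" column of Table 5. The function name is hypothetical.

```python
import numpy as np

def sigma_n1_sq(Sigma1, Sigma2, n1, n2):
    """Null variance of T_n: 2 tr(S1^2)/(n1(n1-1)) + 2 tr(S2^2)/(n2(n2-1)) + 4 tr(S1 S2)/(n1 n2)."""
    # For symmetric matrices, tr(A B) equals the sum of the elementwise product.
    tr11 = np.sum(Sigma1 * Sigma1)
    tr22 = np.sum(Sigma2 * Sigma2)
    tr12 = np.sum(Sigma1 * Sigma2)
    return (2.0 * tr11 / (n1 * (n1 - 1))
            + 2.0 * tr22 / (n2 * (n2 - 1))
            + 4.0 * tr12 / (n1 * n2))

identity = np.eye(1000)
true_values = {n: round(float(sigma_n1_sq(identity, identity, n, n)), 1)
               for n in (10, 20, 30)}
```

With identity covariances the three traces all equal $p = 1000$, so for $n_1 = n_2 = 10$ the value is $2000/90 + 2000/90 + 4000/100 \approx 84.4$, in agreement with the table.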

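6.2. Asymptotic normality of $T_n$. We note that $T_n = T_{n1} + T_{n2}$ where

$$T_{n1} = \frac{\sum_{i \ne j}^{n_1}(X_{1i}-\mu_1)'(X_{1j}-\mu_1)}{n_1(n_1-1)} + \frac{\sum_{i \ne j}^{n_2}(X_{2i}-\mu_2)'(X_{2j}-\mu_2)}{n_2(n_2-1)} - 2\,\frac{\sum_{i=1}^{n_1}\sum_{j=1}^{n_2}(X_{1i}-\mu_1)'(X_{2j}-\mu_2)}{n_1 n_2} \tag{6.1}$$

and

$$T_{n2} = \frac{2\sum_{i=1}^{n_1}(X_{1i}-\mu_1)'(\mu_1-\mu_2)}{n_1} + \frac{2\sum_{i=1}^{n_2}(X_{2i}-\mu_2)'(\mu_2-\mu_1)}{n_2} + \mu_1'\mu_1 + \mu_2'\mu_2 - 2\mu_1'\mu_2.$$

It is easy to show that $E(T_{n1}) = 0$ and $E(T_{n2}) = \|\mu_1 - \mu_2\|^2$, and

$$\operatorname{Var}(T_{n2}) = 4n_1^{-1}(\mu_1-\mu_2)'\Sigma_1(\mu_1-\mu_2) + 4n_2^{-1}(\mu_2-\mu_1)'\Sigma_2(\mu_2-\mu_1).$$

Under (3.4), as

$$\operatorname{Var}\left(\frac{T_{n2} - \|\mu_1-\mu_2\|^2}{\sigma_{n1}}\right) = o(1),$$

$$\frac{T_n - \|\mu_1-\mu_2\|^2}{\sqrt{\operatorname{Var}(T_n)}} = \frac{T_{n1}}{\sigma_{n1}} + o_p(1). \tag{6.2}$$

Under (3.5),

$$\frac{T_n - \|\mu_1-\mu_2\|^2}{\sqrt{\operatorname{Var}(T_n)}} = \frac{T_{n2} - \|\mu_1-\mu_2\|^2}{\sigma_{n2}} + o_p(1). \tag{6.3}$$

As $T_{n2}$ is made up of independent sample averages, its asymptotic normality is readily attainable as shown later. The main task of the proof is for the case under (3.4) when $T_{n1}$ is the contributor of the asymptotic distribution. From (6.1), in the derivation for the asymptotic normality of $T_{n1}$, we can assume without loss of generality that $\mu_1 = \mu_2 = 0$.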

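Let $Y_i = X_{1i}$ for $i = 1, \ldots, n_1$ and $Y_{j+n_1} = X_{2j}$ for $j = 1, \ldots, n_2$, and for $i \ne j$,

$$\phi_{ij} = \begin{cases} n_1^{-1}(n_1-1)^{-1}\,Y_i'Y_j, & \text{if } i, j \in \{1, 2, \ldots, n_1\},\\ -n_1^{-1}n_2^{-1}\,Y_i'Y_j, & \text{if } i \in \{1, 2, \ldots, n_1\} \text{ and } j \in \{n_1+1, \ldots, n_1+n_2\},\\ n_2^{-1}(n_2-1)^{-1}\,Y_i'Y_j, & \text{if } i, j \in \{n_1+1, \ldots, n_1+n_2\}. \end{cases}$$

Define $V_{nj} = \sum_{i=1}^{j-1}\phi_{ij}$ for $j = 2, 3, \ldots, n_1+n_2$, $S_{nm} = \sum_{j=2}^{m} V_{nj}$ and $\mathcal{F}_{nm} = \sigma\{Y_1, Y_2, \ldots, Y_m\}$, which is the $\sigma$-algebra generated by $\{Y_1, Y_2, \ldots, Y_m\}$. Now

$$T_n = 2\sum_{j=2}^{n_1+n_2} V_{nj}.$$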
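The representation of the statistic as twice a sum of the increments $V_{nj}$ can be verified numerically. The sketch below builds the $\phi_{ij}$ array for a pooled sample, sums the entries above the diagonal column by column, and compares the result with a direct evaluation of the three-term formula. The data are treated as already centered, as in the derivation, and all function names are hypothetical.

```python
import numpy as np

def phi_matrix(Y, n1, n2):
    """phi_ij for the pooled sample Y (first n1 rows from sample 1); the diagonal is unused."""
    G = Y @ Y.T
    scale = np.empty((n1 + n2, n1 + n2))
    scale[:n1, :n1] = 1.0 / (n1 * (n1 - 1))
    scale[:n1, n1:] = -1.0 / (n1 * n2)
    scale[n1:, :n1] = -1.0 / (n1 * n2)
    scale[n1:, n1:] = 1.0 / (n2 * (n2 - 1))
    return scale * G

def v_terms(Y, n1, n2):
    """V_nj = sum_{i < j} phi_ij, returned for j = 2, ..., n1 + n2."""
    phi = phi_matrix(Y, n1, n2)
    return np.array([phi[:j, j].sum() for j in range(1, n1 + n2)])

def tn_direct(X1, X2):
    """Three-term formula for the statistic, with diagonal terms removed."""
    n1, n2 = X1.shape[0], X2.shape[0]
    G1, G2 = X1 @ X1.T, X2 @ X2.T
    within1 = (G1.sum() - np.trace(G1)) / (n1 * (n1 - 1))
    within2 = (G2.sum() - np.trace(G2)) / (n2 * (n2 - 1))
    cross = 2.0 * (X1 @ X2.T).sum() / (n1 * n2)
    return float(within1 + within2 - cross)

rng = np.random.default_rng(2)
n1, n2, p = 4, 6, 8
X1, X2 = rng.standard_normal((n1, p)), rng.standard_normal((n2, p))
V = v_terms(np.vstack([X1, X2]), n1, n2)
from_increments = 2.0 * float(V.sum())
direct = tn_direct(X1, X2)
```

Each $V_{nj}$ only involves observations with index below $j$ paired with $Y_j$, which is what makes the partial sums $S_{nm}$ adapted to the filtration $\mathcal{F}_{nm}$.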



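Lemma 1. For each $n$, $\{S_{nm}, \mathcal{F}_{nm}\}_{m=1}^{n}$ is a zero mean, square integrable martingale sequence.

Proof. It is obvious that $\mathcal{F}_{n,j-1} \subseteq \mathcal{F}_{nj}$ for any $1 \le j \le n$ and that $S_{nm}$ is of zero mean and square integrable. We only need to show $E(S_{nq} \mid \mathcal{F}_{nm}) = S_{nm}$ for any $q \ge m$. We note that if $j \le m \le n$, then $E(V_{nj} \mid \mathcal{F}_{nm}) = \sum_{i=1}^{j-1} E(\phi_{ij} \mid \mathcal{F}_{nm}) = \sum_{i=1}^{j-1}\phi_{ij} = V_{nj}$. If $j > m$, then $E(\phi_{ij} \mid \mathcal{F}_{nm}) = E(Y_i'Y_j \mid \mathcal{F}_{nm})$, up to the constant factor in $\phi_{ij}$.

If $i > m$, as $Y_i$ and $Y_j$ are both independent of $\mathcal{F}_{nm}$,

$$E(\phi_{ij} \mid \mathcal{F}_{nm}) = E(\phi_{ij}) = 0.$$

If $i \le m$, $E(\phi_{ij} \mid \mathcal{F}_{nm}) = E(Y_i'Y_j \mid \mathcal{F}_{nm}) = Y_i'E(Y_j) = 0$. Hence,

$$E(V_{nj} \mid \mathcal{F}_{nm}) = 0.$$

In summary, for $q > m$, $E(S_{nq} \mid \mathcal{F}_{nm}) = \sum_{j=2}^{q} E(V_{nj} \mid \mathcal{F}_{nm}) = \sum_{j=2}^{m} V_{nj} = S_{nm}$. This completes the proof of the lemma. □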



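Lemma 2. Under condition (3.4),

$$\frac{\sum_{j=2}^{n_1+n_2} E[V_{nj}^2 \mid \mathcal{F}_{n,j-1}]}{\sigma_{n1}^2} \xrightarrow{p} \frac{1}{4}.$$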



Proof. Note that 

I \i=l / J \n,i2=l 



»2 



= E yLE{Y,Yl\Fn,-i)Y,,= Yl^E{Y,Yl)Yi 
n,j2=l ii,«2=l 
i-l ^ , 

- V Y' =^ y- 

«1,12 = 1 J ^ J 

where $\tilde\Sigma_j = \Sigma_1$, $\tilde n_j = n_1$, for $j \in [1, n_1]$ and $\tilde\Sigma_j = \Sigma_2$, $\tilde n_j = n_2$, if $j \in [n_1+1, n_1+n_2]$.



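Define

$$\eta_n = \sum_{j=2}^{n_1+n_2} E(V_{nj}^2 \mid \mathcal{F}_{n,j-1}).$$

Then

$$E(\eta_n) = \frac{\operatorname{tr}(\Sigma_1^2)}{2n_1(n_1-1)} + \frac{\operatorname{tr}(\Sigma_2^2)}{2n_2(n_2-1)} + \frac{\operatorname{tr}(\Sigma_1\Sigma_2)}{(n_1-1)(n_2-1)} = \tfrac{1}{4}\sigma_{n1}^2\{1 + o(1)\}. \tag{6.4}$$

Now consider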



'ni-|-n2 J-l 



(6.5) 



I j=2 ii,i2=l ■'^ ^ J 

2<il<i2n,i2 = li3,i4 = l ^^^^1 ^ ^2 ^"^2 ^ 

+ 5^ 5^ E K n- in-- 1) K ^ 1) ^ 

j=2 11,42 = 1 13,44 = 1 ■'^ ^ ■'^ ^ J 

2£;(A) say, 
where

4 - V V V y.' Y Y' Y 

(6-6) " . ^ . ^ 

5 - V V V y' =^ y y' =^ y 

j=2 11,42=1 i3,M=l -^^ ^ ■'^ ■> ' 



Derivations given in Chen and Qin (2008) show 



f tr^(Sf) ^ tr^(£i) ^ tr(£f)tr(SiS2) 
\ 4nf (ni - 1)2 4n|(n2 - 1)^ (ni - l)(n2 - 1) 



^ tr(Si)tr(£iS2) ^ tr2(S2Si 



(ni - I)n2(n2 - 1) n\n2{n\ - l)(n2 - 1) 

^ tr(I]f)tr(Ei) I 

+ 2ni(ni-l)n2(n2-l) /^' + ^^'^^' 

and $E(B) = o(\sigma_{n1}^4)$. Hence, from (6.5) and (6.6),

p. 2^ f tr2(Sf) tr2(Si) tr(£f)tr(EiS2) 

^^"^ 1 4nf(ni - 1)2 ^ 4n2(n2 - 1)2 n2(ni - l)(n2 - 1) 

(g7) ^ tr(S2)tr(SiS2) ^ tr2(S2Si) 



(ni - I)n2(n2 - 1) nin2(ni - l)(n2 - 1) 

tr(S2)tr(S2) 



1n\{n\ - I)n2(n2 - 1) 

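Based on (6.4) and (6.7),

$$\operatorname{Var}(\eta_n) = E(\eta_n^2) - E^2(\eta_n) = o(\sigma_{n1}^4). \tag{6.8}$$

Combine (6.4) and (6.8), and we have

$$\sigma_{n1}^{-2}\,E\Biggl\{\sum_{j=2}^{n_1+n_2} E(V_{nj}^2 \mid \mathcal{F}_{n,j-1})\Biggr\} = \sigma_{n1}^{-2}E(\eta_n) = \tfrac{1}{4}\{1 + o(1)\}$$

and

$$\sigma_{n1}^{-4}\operatorname{Var}\Biggl\{\sum_{j=2}^{n_1+n_2} E(V_{nj}^2 \mid \mathcal{F}_{n,j-1})\Biggr\} = \sigma_{n1}^{-4}\operatorname{Var}(\eta_n) = o(1).$$

This completes the proof of Lemma 2. □

Lemma 3. Under condition (3.4),

$$\sum_{j=2}^{n_1+n_2} \sigma_{n1}^{-2}\,E\{V_{nj}^2 I(|V_{nj}| > \epsilon\sigma_{n1}) \mid \mathcal{F}_{n,j-1}\} \xrightarrow{p} 0.$$

Proof. We note that

$$\sum_{j=2}^{n_1+n_2} \sigma_{n1}^{-2}\,E\{V_{nj}^2 I(|V_{nj}| > \epsilon\sigma_{n1}) \mid \mathcal{F}_{n,j-1}\} \le \sigma_{n1}^{-q}\epsilon^{2-q}\sum_{j=2}^{n_1+n_2} E(|V_{nj}|^q \mid \mathcal{F}_{n,j-1})$$

for some $q > 2$. By choosing $q = 4$, the conclusion of the lemma is true if we can show

$$E\Biggl\{\sum_{j=2}^{n_1+n_2} E(V_{nj}^4 \mid \mathcal{F}_{n,j-1})\Biggr\} = o(\sigma_{n1}^4). \tag{6.9}$$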

We notice that 

{ni+n2 ni+n2 ni+n2 /^~^ \^ 

3=2 ) j=2 j=2 \i=l J 

The last term can be decomposed as 3Q + P where 

j=2 sjtt 

and P = 0{n-^)Y,]LT^EillE{Y^Yj)^. Now (6.9) is true if 3Q + P = 
Note that 

ni+n2 j-l 

g = o(n-8) Y 5]ii^{tr(y,i^'yiy/y,i^'y,y;)} 

j=2 s^t 

{ni j-1 ni+n2 j-1 ^ 

YY.^{y;t,,y,y;t.,y,)+ Y 
j=2 s^t 3=ni+l s^t J 

The last equation follows the similar procedure in Lemma 2 under (3.4) 



It remains to show that P = ©(n'^) ^"1+"^ S^,^} ^(y/Yj)^ = o(cj^J. 



Note that 

ni+n2 i-1 
j=2 s=l 
"1 j-1 ni+n2 j-1

j=2 s=l j=ni+l s=l 

{ni j — 1 ni+n2 ni 

j=2 s=l j=ni+l s=l 



ni+n2 j-l 

E E 

j=ni+l s=ni+l 



+ E E ^(^2s-ni^2j~ni)^ 



= Oin-^)iPi+P2 + P3), 
Where Pi = E]L2 E^l E{X[sXi,)\ P2 = TT.ltli TTsU E{X[,X2,-n,Y and 

j=?il + l s=ni+l 

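Let us consider $E(X_{1s}'X_{2,j-n_1})^4$. Define $\Gamma_1'\Gamma_2 =: (v_{ij})_{m \times m}$ and note the following facts which will be used repeatedly in the rest of the Appendix:

$$\sum_{i,j=1}^{m} v_{ij}^4 \le \Biggl(\sum_{i,j=1}^{m} v_{ij}^2\Biggr)^2 = \operatorname{tr}^2(\Gamma_1'\Gamma_2\Gamma_2'\Gamma_1) = \operatorname{tr}^2(\Sigma_2\Sigma_1),$$

$$\sum_{i=1}^{m}\sum_{j_1 \ne j_2}^{m} (v_{ij_1}^2 v_{ij_2}^2) \le \Biggl(\sum_{i,j=1}^{m} v_{ij}^2\Biggr)^2 = \operatorname{tr}^2(\Sigma_2\Sigma_1),$$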

mm m m 

,(2) ^ (2) , „ (2) ^,(2) 



E E '^nii^ni2^i2ii^i2j2 < E ^Ui^hl ^ E 



7;; : 7;: . 
Jl«2 «1«2 ' 



m 

E -a-a = tr(rlS2rir;s2ro = E^'^^^ 

= tr{(EiE2f}, 

where riEsFi = {v^fUxm and (r^SsrO^ = (7;(J)) 



mxm- 



From (3.1), 



E{x[,x2j-n,)' = E E (3 + ^f4' + E(3 + ^) E 



2 2 



+ E(3+A)E 

j'=i n5^i2 



2 2 
+ 9 '^iijiViij2Vi2jiVi2j2

0{tr2(S2Si)} + 0{tr(S2Si)2}. 



Then we conclude 



ni+n2 ni 



0(n-8)P2= ^[0{tr2(S25]i)} + 0{tr(S2Si)2}] 

j=ni-|-l s=l 

= 0(n-5)[0{tr2(S2Si)} + 0{tr(E2Si)2}] 

We can also prove that $O(n^{-8})P_1 = o(\sigma_{n1}^4)$ and $O(n^{-8})P_3 = o(\sigma_{n1}^4)$ by going through a similar procedure. This completes the proof of the lemma. □

Proof of Theorem 1. We note equations (6.2) and (6.3) under conditions (3.4) and (3.5), respectively. Based on Corollary 3.1 of Hall and Heyde (1980) and Lemmas 1, 2 and 3, it can be concluded that $T_{n1}/\sigma_{n1} \xrightarrow{d} N(0, 1)$. This implies the desired asymptotic normality of $T_n$ under (3.4). Under (3.5), as $T_{n2}$ is the sum of two independent averages, its asymptotic normality can be attained by standard means. Hence the theorem is proved. □

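Proof of Theorem 2. We only present the proof for the ratio consistency of $\widehat{\operatorname{tr}(\Sigma_1^2)}$ as the proofs of the other two follow the same route. We want to show

$$E\{\widehat{\operatorname{tr}(\Sigma_1^2)}\} = \operatorname{tr}(\Sigma_1^2)\{1 + o(1)\} \quad\text{and}\quad \operatorname{Var}\{\widehat{\operatorname{tr}(\Sigma_1^2)}\} = o\{\operatorname{tr}^2(\Sigma_1^2)\}. \tag{6.10}$$

For notational simplicity, we denote $X_{1j}$ as $X_j$ and $\Sigma_1$ as $\Sigma$, since we are effectively in a one-sample situation.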
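To make the object of the proof concrete, here is a minimal sketch of a one-sample estimator of $\operatorname{tr}(\Sigma^2)$. It assumes the leave-two-out form $\{n(n-1)\}^{-1}\operatorname{tr}\sum_{j \ne k}(X_j - \bar X_{(j,k)})X_j'(X_k - \bar X_{(j,k)})X_k'$, where $\bar X_{(j,k)}$ is the sample mean without $X_j$ and $X_k$. The estimator is defined outside this section, so this form and the function names are assumptions of the sketch.

```python
import numpy as np

def trace_sigma_sq_hat(X):
    """Leave-two-out estimate of tr(Sigma^2); rows of X are observations (n >= 3)."""
    n = X.shape[0]
    total = X.sum(axis=0)
    acc = 0.0
    for j in range(n):
        for k in range(n):
            if j == k:
                continue
            mean_jk = (total - X[j] - X[k]) / (n - 2)   # mean without X_j and X_k
            # tr{(X_j - m) X_j' (X_k - m) X_k'} = {X_j'(X_k - m)} {X_k'(X_j - m)}
            acc += (X[j] @ (X[k] - mean_jk)) * (X[k] @ (X[j] - mean_jk))
    return acc / (n * (n - 1))

def trace_sigma_sq_hat_explicit(X):
    """Same quantity through explicit p x p matrix products, used only as a check."""
    n = X.shape[0]
    total = X.sum(axis=0)
    acc = 0.0
    for j in range(n):
        for k in range(n):
            if j != k:
                m = (total - X[j] - X[k]) / (n - 2)
                acc += np.trace(np.outer(X[j] - m, X[j]) @ np.outer(X[k] - m, X[k]))
    return acc / (n * (n - 1))

small = np.random.default_rng(5).standard_normal((5, 3))
X = np.random.default_rng(7).standard_normal((100, 20))   # Sigma = I, so tr(Sigma^2) = 20
estimate = trace_sigma_sq_hat(X)
```

The scalar form avoids building any $p \times p$ matrix, so the cost is $O(n^2 p)$; for the simulated identity-covariance data the estimate should land near 20.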
Note that 

t7(S2) = {n(n-l)}-i 

n 

X tr Y,{i^J - - f^Yi^k - f^){Xk - li)' 

- 2(%fe) - /z)(X,- - /i)'(Xfc - ,i){Xk - /x)'} 

n 

+ Y.{2{X, - fi)fi'iXk - ii){Xk - fi)' 

- 2(X(,- fc) - fi)fJ.'iXk - fi)iXk - fi)'} 
n
n 

- 2(^(j,fc) - /^)/^'(^(i,fc) - lj){Xk - At)'} 

n 



i^k 



10 



--■.^tT{Ai), say. 



1=1 



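It is easy to show that $E\{\operatorname{tr}(A_1)\} = \operatorname{tr}(\Sigma^2)$, $E\{\operatorname{tr}(A_i)\} = 0$ for $i = 2, \ldots, 9$ and $E\{\operatorname{tr}(A_{10})\} = \mu'\Sigma\mu/(n-2) = o\{\operatorname{tr}(\Sigma^2)\}$. The last equation is based on (3.4). This leads to the first part of (6.10). Since $\operatorname{tr}(A_{10})$ is nonnegative and $E\{\operatorname{tr}(A_{10})\} = o\{\operatorname{tr}(\Sigma^2)\}$, we have $\operatorname{tr}(A_{10}) = o_p\{\operatorname{tr}(\Sigma^2)\}$. However, to establish the orders of other terms, we need to derive $\operatorname{Var}\{\operatorname{tr}(A_i)\}$. We shall only show $\operatorname{Var}\{\operatorname{tr}(A_1)\}$ here. Derivations for other $\operatorname{Var}\{\operatorname{tr}(A_i)\}$ are similar.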
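Note that

$$\operatorname{Var}\{\operatorname{tr}(A_1)\} + \operatorname{tr}^2(\Sigma^2) = E\Biggl[\frac{1}{n(n-1)}\operatorname{tr}\Biggl\{\sum_{j \ne k}(X_j-\mu)(X_j-\mu)'(X_k-\mu)(X_k-\mu)'\Biggr\}\Biggr]^2$$

$$= \frac{1}{n^2(n-1)^2}\,E\Biggl[\operatorname{tr}\Biggl\{\sum_{j_1 \ne k_1}(X_{j_1}-\mu)(X_{j_1}-\mu)'(X_{k_1}-\mu)(X_{k_1}-\mu)'\Biggr\} \times \operatorname{tr}\Biggl\{\sum_{j_2 \ne k_2}(X_{j_2}-\mu)(X_{j_2}-\mu)'(X_{k_2}-\mu)(X_{k_2}-\mu)'\Biggr\}\Biggr].$$

It can be shown, by considering the possible combinations of the subscripts $j_1$, $k_1$, $j_2$ and $k_2$, that

$$\operatorname{Var}\{\operatorname{tr}(A_1)\} = 2\{n(n-1)\}^{-1}E\{(X_1-\mu)'(X_2-\mu)\}^4 + \frac{4(n-2)}{n(n-1)}E\{(X_1-\mu)'\Sigma(X_1-\mu)\}^2 + o\{\operatorname{tr}^2(\Sigma^2)\} \tag{6.11}$$

$$= \frac{2}{n(n-1)}B_{11} + \frac{4(n-2)}{n(n-1)}B_{12} + o\{\operatorname{tr}^2(\Sigma^2)\},$$

where

$$B_{11} = E(Z_1'\Gamma'\Gamma Z_2)^4 = E\Biggl(\sum_{s,t=1}^{m} z_{1s}\nu_{st}z_{2t}\Biggr)^4 = E\Biggl(\sum_{s_1,s_2,s_3,s_4,t_1,t_2,t_3,t_4=1}^{m} \nu_{s_1t_1}\nu_{s_2t_2}\nu_{s_3t_3}\nu_{s_4t_4}\, z_{1s_1}z_{1s_2}z_{1s_3}z_{1s_4}\, z_{2t_1}z_{2t_2}z_{2t_3}z_{2t_4}\Biggr)$$

and

$$B_{12} = E(Z_1'\Gamma'\Sigma\Gamma Z_1)^2 = E\Biggl(\sum_{s,t=1}^{m} z_{1s}u_{st}z_{1t}\Biggr)^2 = E\Biggl(\sum_{s_1,s_2,t_1,t_2=1}^{m} u_{s_1t_1}u_{s_2t_2}\, z_{1s_1}z_{1s_2}z_{1t_1}z_{1t_2}\Biggr).$$

Here $\nu_{st}$ and $u_{st}$ are, respectively, the elements of $\Gamma'\Gamma$ and $\Gamma'\Sigma\Gamma$.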

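Since $\operatorname{tr}^2(\Sigma^2) = \bigl(\sum_{s,t=1}^{m}\nu_{st}^2\bigr)^2 = \sum_{s_1,s_2,t_1,t_2=1}^{m}\nu_{s_1t_1}^2\nu_{s_2t_2}^2$ and $\operatorname{tr}(\Sigma^4) = \sum_{t_1,t_2=1}^{m}u_{t_1t_2}^2$, it can be shown that $B_{11} \le c\,\operatorname{tr}^2(\Sigma^2)$ for a finite positive number $c$ and hence $\{n(n-1)\}^{-1}B_{11} = o\{\operatorname{tr}^2(\Sigma^2)\}$. It may also be shown that

$$B_{12} = 2\sum_{s,t=1}^{m}u_{st}^2 + \sum_{s,t=1}^{m}u_{ss}u_{tt} + \Delta\sum_{s=1}^{m}u_{ss}^2 = 2\operatorname{tr}(\Sigma^4) + \operatorname{tr}^2(\Sigma^2) + \Delta\sum_{s=1}^{m}u_{ss}^2 \le (2+\Delta)\operatorname{tr}(\Sigma^4) + \operatorname{tr}^2(\Sigma^2).$$

Therefore, from (6.11),

$$\operatorname{Var}\{\operatorname{tr}(A_1)\} \le \frac{2c}{n(n-1)}\operatorname{tr}^2(\Sigma^2) + \frac{4(n-2)}{n(n-1)}\{(2+\Delta)\operatorname{tr}(\Sigma^4) + \operatorname{tr}^2(\Sigma^2)\} = o\{\operatorname{tr}^2(\Sigma^2)\}.$$

This completes the proof. □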
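The closed form used for $B_{12}$ is the standard second moment of a quadratic form, $E\{(Z'UZ)^2\} = 2\operatorname{tr}(U^2) + \operatorname{tr}^2(U) + \Delta\sum_s u_{ss}^2$ for symmetric $U$ and i.i.d. standardized entries with $E(z^4) = 3 + \Delta$. The sketch below checks it exactly by enumeration for Rademacher entries, for which $\Delta = -2$. The choice of Rademacher innovations and the reading of $\Delta$ as the excess fourth moment from (3.1) are assumptions of this sketch, made so that the expectation can be computed without simulation error.

```python
import itertools
import numpy as np

def second_moment_quadratic_form(U):
    """Exact E{(Z'UZ)^2} for Z with i.i.d. Rademacher (+1/-1) entries, by enumeration."""
    m = U.shape[0]
    values = []
    for signs in itertools.product((-1.0, 1.0), repeat=m):
        z = np.array(signs)
        values.append(z @ U @ z)
    return float(np.mean(np.square(values)))

def closed_form(U, delta):
    """2 tr(U^2) + tr^2(U) + delta * sum_s u_ss^2 for symmetric U."""
    return float(2.0 * np.trace(U @ U) + np.trace(U) ** 2
                 + delta * np.sum(np.diag(U) ** 2))

rng = np.random.default_rng(3)
G = rng.standard_normal((4, 4))
U = G @ G.T                                # symmetric, in the role of Gamma' Sigma Gamma
exact = second_moment_quadratic_form(U)
formula = closed_form(U, delta=-2.0)       # Rademacher: E(z^4) = 1 = 3 + (-2)
```

For normal innovations $\Delta = 0$ and the expression reduces to $2\operatorname{tr}(U^2) + \operatorname{tr}^2(U)$, that is, $2\operatorname{tr}(\Sigma^4) + \operatorname{tr}^2(\Sigma^2)$ when $U = \Gamma'\Sigma\Gamma$.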